Language and Personality in Computer-Mediated Communication: A cross-genre comparison

Running head: LANGUAGE AND PERSONALITY IN CMC

Authors

  • Alastair J. Gill
  • Scott Nowson
  • Jon Oberlander
Abstract

Computer-mediated communication (CMC) is often considered distinct from traditional written language. In this paper we examine language use in two computer-mediated environments (e-mail and weblogs) and compare it to that in non-CMC written texts. Previous work has used the LIWC text analysis tool to derive broad language factors for written texts. Here, we find that the linguistic factor structure of e-mail is similar to that of weblogs, with both broadly replicating previous results from offline studies. However, we note that blogs differ from non-CMC language to a greater extent than e-mail. Previous work has shown that the linguistic factors are related to author personality. Here, we find that Neuroticism and Agreeableness distinguish CMC from non-CMC environments, while noting that results from the EPQ-R and Five-Factor personality measures are largely compatible.

Is the structure of offline written language significantly different from that of computer-mediated communication (CMC)? In this paper we investigate whether the broad language factors derived from written texts using the LIWC text analysis tool (Pennebaker & King, 1999) can be replicated in CMC. However, it is arguably wrong to regard CMC as a single, largely homogeneous genre. We therefore examine language features derived from both e-mail and personal weblogs and compare these to the findings for offline written language. Given that personality relates to broad factors of written language use, we also test whether these relationships hold up in the computer-mediated environment, and examine the effects of different personality models.

Why should the computer-mediated environment influence language use?
Computer-mediated communication, in particular e-mail and internet relay chat, is widely noted for having characteristic features (e.g., short telegraphic phrases, slang and abbreviations such as LOL, or emoticons), which have been adopted to convey paralinguistic or social cues (Werry, 1996; Hancock & Dunham, 2001; Colley & Todd, 2002). These features are not shared with traditional writing, even though both provide less rich information than face-to-face communication (Panteli, 2002; see also Spears, Lea, & Postmes, 2001, for theories of why people communicate differently in CMC). Indeed, many of us have experienced the effects of CMC on communication and on different individuals: it offers anonymity, so that people are more comfortable participating in interactions (Bloch, 2002; Yellen, Winniford, & Sanford, 1995), or even engaging in deception (Hancock, Curry, Goorha, & Woodworth, 2006). Further, the liberating effect of CMC can also be noted in the intimacy of topics discussed in weblog and e-mail contexts, and perhaps even more notably in online personal advertisements (Groom & Pennebaker, 2005). Therefore, it may well be that different personalities respond in distinctive ways to such an environment. We examine this later in the paper.

In what follows, we first discuss the two CMC genres on which this paper focuses, e-mail and weblogs. We then review the background to the LIWC text analysis technique which we adopt, along with the previous findings of Pennebaker and King (1999), who derived broad language factors for written language, and related these to personality. After noting some further studies which have related language to personality types, we then provide additional detail on personality psychology, and in particular the two main models used to describe personality types in this paper. We note some specific findings relating personality to CMC behaviour, and then frame our current research questions.
We then present the data used in the subsequent experiments: Experiment 1, which derives language factors from the LIWC variables, and Experiment 2, which relates these language factors to personality. We discuss the findings from these experiments together in our general discussion, and we finish with conclusions and implications for future work.

Internet-based CMC should not be treated as a single genre, and is in fact composed of a number of distinct types of communication (Yates, 1996). For example, static webpages are for the most part wholly written, but instant messengers create written (but ephemeral) conversations that mimic spoken forms in many ways. Relatively stable varieties of internet CMC are emerging, although there is still variation within them, and new forms are continually evolving (Cho, 1996; Gruber, 2000). However, this has not prevented researchers from attempting to classify these forms of communication, either functionally or linguistically (Crowston & Williams, 2000; Shepherd, Watters, & Kennedy, 2004; Santini, 2005; Biber, 1988, 2004).

E-mail is one of the major contributors to the popularity of the internet. For example, it has been estimated that 73% of American adults access the internet (Madden, 2006), that 90% of internet users access e-mail (Fallows, 2005), and that 30 billion e-mails are sent daily (Fallows, 2004). It is a written form, in which interlocutors are physically separated; it is also durable, and authors often use complex linguistic constructions; however, e-mail is often unedited, uses first- and second-person pronouns, present tense and contractions, and it is generally informal in tone (Bälter, 1998; Baron, 2001). Indeed, characteristic e-mail features have been identified, including ellipses, capitalisation, extensive use of exclamation marks, and question marks (termed ‘e-mailisms’; Colley & Todd, 2002).
As a result, e-mail is often considered intermediate in form between speech and writing (Yates, 1996; Baron, 1998; Gruber, 2000; Nowson, Oberlander, & Gill, 2005; cf. Collot & Belmore, 1996). However, e-mail certainly differs from speech, being more verbose, yet less emotional (Whittaker, 2003).

More recently, weblogs have rapidly grown in popularity. It has been estimated that over 12 million blogs are regularly updated by American adults alone (Lenhart & Fox, 2006). Weblogs are frequently updated websites relating to any topic, and fall between static HTML webpages and asynchronous forms of CMC such as newsgroups. The most popular weblogs are those written in a personal journal style, which are commonly referred to as ‘blogs’ (Herring, Kouper, Scheidt, & Wright, 2004). Like e-mail, blogs can also demonstrate characteristic linguistic features, with posts written in ‘short, paratactic sentences’ employing ‘informal, non-standard constructions and slang’ (Nilsson, 2003). However, due to the social nature of blogging, it is also possible to observe communities and social groups, including in-group and out-group language behaviour (such as I, me, my, we, us and our, rather than they, them and their) and shared background knowledge and concepts (Nilsson, 2003; Cassell & Tversky, 2005; cf. Brown & Yule, 1983), as well as shared responses to traumatic events such as September 11, 2001 (Cohn, Mehl, & Pennebaker, 2004; Krishnamurthy, 2002). In this paper, we examine the fundamental similarity of e-mail and blog CMC varieties, and compare these back to previous studies of offline written language.

We now turn to the methodology adopted in this paper, content analysis, which focuses on context-independent occurrences of lexical content words in written text.
Although there are many different approaches (see Mehl, 2005; Lowe, 2004; Pennebaker, Mehl, & Niederhoffer, 2003; and Smith, 1992, for overviews; and Oberlander & Gill, 2006, and Mehl & Gill, forthcoming, for a comparison between these methods and corpus-based approaches), here we describe one method in particular, which uses the Linguistic Inquiry and Word Count text analysis program (LIWC; Pennebaker & Francis, 1999). LIWC works by counting occurrences of words or word-stems belonging to pre-defined semantic and syntactic categories, which are divided into four main groups: Linguistic Dimensions, Psychological Processes, Relativity, and Personal Concerns. For instance, using this system, words like could, should and would are categorised as ‘discrepancies’, allowing the overall percentage of ‘discrepancy’ words to be calculated for the text as a whole. One of the major strengths of the LIWC text analysis approach is that the dictionaries have been rated by judges (Pennebaker & Francis, 1999; Pennebaker & King, 1999); the downside, however, is that syntactic features are not derived from part-of-speech analysis of the texts (cf. Wmatrix; Rayson, 2003; and Gill, 2004; Oberlander & Gill, 2006; Mehl & Gill, forthcoming).

LIWC has been used to examine the relationship between language and a large range of social and psychological phenomena, such as health and well-being (Pennebaker, Mayne, & Francis, 1997; Pennebaker, 1997; Graybeal, Sexton, & Pennebaker, 2002; cf. Oxman, Rosenberg, Schnurr, & Tucker, 1988), deception (Newman, Pennebaker, Berry, & Richards, 2003), gender (Mehl & Pennebaker, 2003), emotional tone (in newsgroups; Joyce & Kraut, 2006), interpersonal affiliations (Cassell & Tversky, 2005), and individual differences (Pennebaker & King, 1999). We now discuss the last of these studies in greater detail, since we aim to compare our CMC data with Pennebaker and King’s (1999) findings, which derived language factors from written texts.
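The dictionary-based counting described above can be illustrated with a minimal Python sketch. It is not the LIWC program itself: the tiny category dictionary, the trailing `*` convention for word-stems, the tokenizer, and the names `CATEGORIES` and `category_percentages` are all assumptions made for illustration. Only the example words (could, should, would; I, me, my; no, not, never; because, reason) are taken from the text.

```python
import re

# Hypothetical miniature dictionary; the real LIWC dictionaries are far larger
# and were rated by judges. A trailing "*" marks a word-stem (an assumed
# convention for this sketch).
CATEGORIES = {
    "discrepancy": ["could", "should", "would"],
    "first_person_singular": ["i", "me", "my"],
    "negation": ["no", "not", "never"],
    "causation": ["because", "reason*"],
}


def tokenize(text):
    """Lower-case the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matches(token, entry):
    """Exact match for whole words, prefix match for stems ending in '*'."""
    if entry.endswith("*"):
        return token.startswith(entry[:-1])
    return token == entry


def category_percentages(text, categories=CATEGORIES):
    """Return, per category, the percentage of tokens that fall in it."""
    tokens = tokenize(text)
    total = len(tokens)
    result = {}
    for name, entries in categories.items():
        hits = sum(1 for t in tokens if any(matches(t, e) for e in entries))
        result[name] = 100.0 * hits / total if total else 0.0
    return result


sample = "I should not worry, because my reasons would never change."
percentages = category_percentages(sample)
for name, value in percentages.items():
    print(f"{name}: {value:.1f}%")
```

Because words are matched without regard to context, the approach is simple and reproducible, but, as noted above, it cannot recover syntactic information that a part-of-speech analysis would provide.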
Pennebaker and King (1999) analysed texts written by authors for whom five-factor personality information was available. The studies used multiple writing samples produced by over 800 participants in undergraduate-level summer schools. Factor analysis was used to derive a small number of factors grouping individual LIWC variables, and these were then correlated with writers’ scores on personality dimensions. We note the similarity of this approach to that which Biber used to explore genre (e.g., Biber, 1995; cf. Lee, forthcoming).

The four derived factors were: ‘Immediacy’, ‘Making Distinctions’, ‘The Social Past’, and ‘Rationalization’. The ‘Immediacy’ factor consisted of loadings of the total number of first-person singular words (e.g., I, me, my), fewer articles (a, an, the), fewer long words, present-tense verbs, and discrepancies (would, could, should); Pennebaker and King cite Mehrabian’s (1967) verbal non-immediacy in relation to its naming. ‘Making Distinctions’ included loadings of the LIWC dictionaries for exclusive words (but, without, except), tentative words (perhaps, maybe), negations (no, not, never), fewer inclusion words (and, with), and also a secondary loading of discrepancies. The third factor, ‘The Social Past’, was characterised by a high use of past-tense verbs, fewer present-tense verbs, fewer positive emotion words, and high social reference. The fourth and final factor to emerge from the original study was ‘Rationalization’, which featured loadings of causation words (because, reason), insight words (understand, realise), and fewer negative emotion words.
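To make the step from language variables to a factor score and then to a personality correlation concrete, the following Python sketch builds a simplified ‘Immediacy’ score and correlates it with a trait score. It is a stand-in for, not a reproduction of, the factor analysis described above: it uses unit-weighted sums of standardised variables, with signs taken from the loadings listed in the text, rather than estimated loadings. The variable names, the helper functions, and all numbers are invented for illustration and are not data from any study.

```python
from math import sqrt
from statistics import mean, pstdev

# Signs follow the 'Immediacy' loadings described in the text: more
# first-person singular, present-tense and discrepancy words; fewer articles
# and long words. Unit weights are an assumption of this sketch.
IMMEDIACY_SIGNS = {
    "first_person_singular": +1,
    "present_tense": +1,
    "discrepancy": +1,
    "articles": -1,
    "long_words": -1,
}


def z_scores(values):
    """Standardise a list of values (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s else 0.0 for v in values]


def composite_scores(texts, signs):
    """Sum signed z-scores per text; texts are dicts of variable -> percentage."""
    columns = {var: z_scores([t[var] for t in texts]) for var in signs}
    return [
        sum(signs[var] * columns[var][i] for var in signs)
        for i in range(len(texts))
    ]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy values, for illustration only; not data from any study.
toy_texts = [
    {"first_person_singular": 8.0, "present_tense": 12.0, "discrepancy": 3.0,
     "articles": 4.0, "long_words": 10.0},
    {"first_person_singular": 5.0, "present_tense": 9.0, "discrepancy": 2.0,
     "articles": 6.0, "long_words": 14.0},
    {"first_person_singular": 2.0, "present_tense": 6.0, "discrepancy": 1.0,
     "articles": 8.0, "long_words": 18.0},
]
toy_trait = [30.0, 20.0, 10.0]  # hypothetical questionnaire scores

immediacy = composite_scores(toy_texts, IMMEDIACY_SIGNS)
r = pearson_r(immediacy, toy_trait)
print(f"composite scores: {immediacy}")
print(f"correlation with trait: {r:.3f}")
```

The toy values are deliberately constructed so that the composite and the trait score move together, which is why the correlation is close to 1; real data would show far weaker relationships. A full replication would estimate loadings with a factor analysis over many texts instead of assuming unit weights.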
Pennebaker and King correlated these language factors with personality scores and found the following relationships: Extraverts used language associated with ‘The Social Past’, and avoided language associated with ‘Making Distinctions’; Neurotic individuals used language associated with ‘Immediacy’; individuals scoring high in Openness used language associated with ‘The Social Past’ (and some related to ‘Making Distinctions’), and avoided language associated with ‘Immediacy’; high Agreeableness scorers used language associated with ‘Immediacy’; and high Conscientiousness scorers avoided language associated with ‘Making Distinctions’.

A more recent study has used LIWC to analyse speech sampled from everyday interactions (Mehl, Gosling, & Pennebaker, 2006). Overall, this showed that Extraverts have a higher word count, with shorter words; Neurotics have a lower word count; people with higher levels of Openness talk less about social processes, and use fewer past-tense words and third-person pronouns; individuals higher in Agreeableness use more first-person pronouns, fewer articles and fewer swear words; and those higher in Conscientiousness use fewer words relating to negative emotions and swearing.

We now briefly note further approaches to the study of personality and communication. These have mainly examined Extraversion; classifying utterances into speech acts has shown, for example, that Extraverts initiate more laughter; express more pleasure talk, agreement, and compliments; use more self-referent statements; and talk more, focusing on extra-curricular activities. Introverts, on the other hand, use more language relating to hedges and problem talk (Gifford & Hine, 1994; Thorne, 1987). Additionally, Extraverts have been shown to talk more in at least some situations (Carment, Miles, & Cervin, 1965; cf. Thorne, 1987).

We have already mentioned personality, but before going any further we discuss this topic in more depth. In this paper we refer to two main models of personality, which hold that the core characteristics (traits) of an individual are relatively stable over time; an alternative approach is to view personality as a social construct which varies according to context (Matthews, Deary, & Whiteman, 2003). These trait-based models of personality are the five-factor model (FFM; Digman, 1990; Costa & McCrae, 1992; Wiggins & Pincus, 1992; Goldberg, 1993), as used by Pennebaker and King (1999), and Eysenck’s three-factor model (H. Eysenck & Eysenck, 1991; S. Eysenck, Eysenck, & Barrett, 1985). Both of these describe Extraversion (Extraversion–Introversion) and Neuroticism (Emotionality–Stability), which are undisputed and central to both theories of personality. To these, the five-factor model adds Openness, Agreeableness and Conscientiousness, while the three-factor model adds the trait of Psychoticism (Matthews et al., 2003; Lippa & Dietz, 2000).

Extraversion is one of the most salient and visible personality traits (Funder, 1995), and one of the few which researchers generally agree provides ‘consistent and valid information’ (Jonassen & Grabowski, 1993). High scorers on this trait are characterised by their sociability, energy, optimism and impulsivity, whereas low scorers are regarded as being quiet, retiring, and cautious (H. Eysenck & Eysenck, 1975). Turning to the other core dimension, Neuroticism: the highly Neurotic individual is a worrier with a strong emotional reaction of anxiety to such thoughts. On the other hand, the stable, low Neuroticism scorer is calm, even-tempered, controlled and unworried (H. Eysenck & Eysenck, 1975).
With regard to the remaining traits, we take EPQ-R Psychoticism to be inversely related to Agreeableness and Conscientiousness of the five-factor model (Kline, 1993) and, like Larstone, Jang, Livesley, Vernon, and Wolf (2002), we note that ‘each instrument is an imperfect measure of personality that shares components of variance with the other while also tapping specific dimensions’. A high Psychoticism scorer can be described as a ‘loner’, not easily fitting in anywhere, being hostile and aggressive, and having a liking for strange or unusual things; the low scorer is regarded as showing the opposite characteristics, such as warmth, kindness and compassion (H. Eysenck & Eysenck, 1975). Indeed, subsequent research has suggested that Psychoticism is indicative of a thoughtless or reckless personality, and that high Psychoticism scorers appear to have experienced an excess of severe and threatening life events (Pickering et al., 2003).

Agreeableness relates to an individual’s ‘pleasantness’ in high scorers, or ‘disagreeableness’ in low scorers. Costa and McCrae (1992) describe the facets which compose Agreeableness as follows: Trust, Straightforwardness, Altruism, Compliance, Modesty, and Tender-Mindedness. For Conscientiousness, the five-factor model facets are Competence, Order, Dutifulness, Achievement Striving, Self-Discipline, and Deliberation; in contrast to the highly conscientious individual, low scorers on this trait could be described as ‘disorganised’ and ‘lackadaisical’. The final trait of the five-factor model is Openness. Costa and McCrae (1992) describe the facets of this trait as: Fantasy, Aesthetics, Feelings, Actions, Ideas, and Values. A high Openness scorer would therefore see themselves as ‘creative’ and ‘open-minded’, with a low scorer being more of a ‘practical’ individual.
Finally, in drawing together the different strands of this paper, namely the interaction between types of computer-mediated communication and language, and how this relates to personality, we note the following research which has examined personality and CMC. For example, e-mail can provide a more comfortable environment for Introverts (Yellen et al., 1995; Bloch, 2002); however, it is also widely used by the population more generally. In the case of weblogs, Nowson (2006), Nowson and Oberlander (2006), and Nowson et al. (2005) observe that bloggers show slightly higher levels of Extraversion and Agreeableness, and also a strongly skewed distribution towards higher levels of Openness; however, these results are difficult to compare with established norms collected from different populations (Buchanan, 2001, 2005).

Similar resources

Language and Personality in Computer-Mediated Communication: A cross-genre comparison

It is known that personality is important in computer-mediated communication (CMC), influencing both how we express ourselves, and how we are perceived. Here we build in two ways on previous work which has used the LIWC text analysis tool to derive language factors relating to personality. First, we investigate whether linguistic factor structure in e-mail is similar to that in weblogs, and how...

The Language of Weblogs: A study of genre and individual differences

This thesis describes a linguistic investigation of individual differences in online personal diaries, or ‘blogs.’ There is substantial evidence of gender differences in language (Lakoff, 1975), and to a lesser extent linguistic projection of personality (Pennebaker & King, 1999). Recent work has investigated these latter differences in the area of computer-mediated communication (CMC), specifi...

The Relationship Between Personality Traits and Psychological Capital with the Mediation of Body Language in University Students

Introduction: The present study aimed to investigate the relationship between personality traits and psychological capital through the mediation of body language. Methods: This was a descriptive, correlational study. The statistical population consisted of all eighteen thousand students of the Islamic Azad University of Kerman. Three hundred eighty people were selected by simple random...

The Impact of Language on Personality Ethic as a Social Paradigm

This study aimed to explore the role of language type in personality ethic- as a social paradigm. To do so, 30 Iranian advanced bilingual EFL university students were selected based on their performance on the OPT. Then, they were asked to respond to an ethical survey as modelled by Poulshock in two Persian and English versions at the time interval of one month. Their responses to both versions...

On the Relationship between Language Learning Strategies and Personality Types among Iranian EFL Learners

In recent years, language-learning research has been more concerned with factors that may affect the choice of learning strategies among learners. In studies conducted by Cohen (1990), Ehrman and Oxford (1995), MacIntyre and Gardner (1989) and Reid (1987), these factors have been identified as motivation, gender, learning style, previous experience, and personality type. This study attempts to ...

Publication date: 2007